The error performance of the ensemble of typical LDPC codes transmitted over the binary erasure channel 
(BEC) is analyzed. In the past, lower bounds on the error exponents were derived. In this paper a probabilistic 
upper bound on this error exponent is derived. This bound holds with some confidence level. 

Index Terms: Block codes, error exponent, expurgated ensemble, stopping sets, low-density parity-check 
(LDPC) codes, iterative decoding, binary erasure channel (BEC). 

I. Introduction 

Low-density parity-check (LDPC) codes, discovered by Gallager [1], have been widely researched over 
the last decade and a half. Asymptotic results are widely known for these codes, including results on the 
performance under maximum-likelihood (ML) decoding [1], [2], [3], [4], [5], average ensemble distance 
spectra [1], [6], [7], [8], [9], stopping set distributions [7], [8], [9], [10], thresholds for iterative decoding 
using density evolution [11], [12], and others. However, accurate finite-length analysis of LDPC codes 
under iterative sum-product decoding is currently available only for the binary erasure channel (BEC) [13]. 
This is due to the simplicity of the channel model and of the graph-based iterative decoder, which lends 
itself to a more detailed analysis. Analysis of the combinatorial properties of stopping sets and their 
contribution to the error performance reveals that the average error probability of the LDPC ensemble 
is proportional to the inverse of a polynomial in the block length N [7]. This behavior is attributed to the 
existence of "bad" codes which possess small stopping sets; if these codes were removed from the 
ensemble, the average error probability would instead decrease exponentially with N. Fortunately, these 
"bad" codes constitute a small fraction of the entire ensemble, and this fraction is proportional to the 
inverse of a polynomial in N. 

After removing the undesirable codes, we obtain an expurgated ensemble, for which there exists a 
positive error exponent. In [7], lower bounds on this error exponent of typical codes in the regular and 
irregular LDPC code ensembles were derived. In this paper we obtain an upper bound on this exponent, 
and compare it with the above-mentioned lower bounds. Similar to [5], which considers upper bounds 
on the error exponent of LDPC codes under ML decoding, our bounds depend on some confidence level. 

The correspondence is organized as follows. Section II introduces notation and preliminary material. 
Section III introduces a lower bound on the error (erasure) probability from which an upper bound on 
the exponent is derived. Section IV presents numerical results and comparisons with previous results. 
Section V concludes the paper. 

II. Preliminaries 

A. Notation 

We will use the following notation throughout the paper. 

• Let $\{a_i\}_{i=1}^{k}$ be a set of non-negative real numbers such that $\sum_i a_i \le 1$. The entropy function of 
$\{a_i\}_{i=1}^{k}$ is defined as 

$$h(a_1,\ldots,a_k) = -\sum_{i=1}^{k} a_i \log(a_i) - \Bigl(1 - \sum_{i=1}^{k} a_i\Bigr) \log\Bigl(1 - \sum_{i=1}^{k} a_i\Bigr)$$

where log is the base-2 logarithm. We use the convention $0 \log 0 = 0$. 

• Given an integer $n$ and integers $(n_1,\ldots,n_k)$ such that $\sum_i n_i \le n$, 

$$\binom{n}{n_1,n_2,\ldots,n_k} \triangleq \frac{n!}{n_1!\, n_2! \cdots n_k!\, \bigl(n - \sum_{i=1}^{k} n_i\bigr)!}$$

is the multinomial coefficient of $n$ over $(n_1,\ldots,n_k)$. We will use the following property of 
multinomial coefficients 

$$\log\binom{n}{n_1,\ldots,n_k} = n\Bigl(h\Bigl(\frac{n_1}{n},\ldots,\frac{n_k}{n}\Bigr) + o(1)\Bigr) \tag{1}$$

which is easily proven using Stirling's approximation. 
• If $p(x)$ is a polynomial, then we will denote the coefficient of $x^i$ by $[x^i]\,p(x)$, i.e., 

$$p(x) = \sum_{i} [x^i]\,p(x)\; x^i$$

The same notation is extended for use with multivariate polynomials, e.g., 

$$p(x,y,z) = \sum_{i,j,k} [x^i y^j z^k]\,p(x,y,z)\; x^i y^j z^k$$
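
As a numerical illustration of property (1), the following Python sketch compares the normalized logarithm of a multinomial coefficient with the entropy function. The function names (`entropy`, `log2_multinomial`, `stirling_gap`) are hypothetical helpers introduced here for illustration only; they are not part of the original derivation.

```python
import math

def entropy(*a):
    """Entropy h(a_1,...,a_k) in bits, with the convention 0*log(0) = 0."""
    rest = 1.0 - sum(a)
    terms = list(a) + [rest]
    return -sum(t * math.log2(t) for t in terms if t > 0)

def log2_multinomial(n, parts):
    """log2 of n! / (n_1! ... n_k! (n - sum n_i)!), computed via lgamma."""
    rest = n - sum(parts)
    lg = math.lgamma(n + 1) - sum(math.lgamma(p + 1) for p in list(parts) + [rest])
    return lg / math.log(2)

def stirling_gap(n, fractions):
    """h(n_1/n,...,n_k/n) - (1/n) log2 multinomial; this is the o(1) term in (1)."""
    parts = [round(f * n) for f in fractions]
    exact = log2_multinomial(n, parts) / n
    approx = entropy(*[p / n for p in parts])
    return approx - exact

if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, stirling_gap(n, (0.2, 0.3)))
```

The printed gap should shrink as $n$ grows, which is the content of the $o(1)$ term in (1).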

B. A Second-Order Inequality for Probabilities 

Dawson and Sankoff [14] obtained a lower bound on the probability of a finite union of events. Their 
result asserts the following. Let $\{A_i\}_{i=1}^{M}$ be a finite family of events in a probability space $(\Omega, P)$. Denote 

$$S_1 = \sum_{i \in I} P(A_i), \qquad S_2 = \sum_{\substack{i,j \in I \\ i>j}} P(A_i \cap A_j)$$

where $I = \{1,\ldots,M\}$. Then 

$$P\Bigl(\bigcup_{i=1}^{M} A_i\Bigr) \ge \frac{2}{r+1}\, S_1 - \frac{2}{r(r+1)}\, S_2 \tag{2}$$

for any $r \in \{1,\ldots,M-1\}$. 

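Following the derivation in [14], we derive a result which generalizes (2). For an event $A$, 
denote by $1_{\{A\}}$ the indicator (random variable) of $A$, i.e., for $\omega \in \Omega$, 

$$1_{\{A\}}(\omega) = \begin{cases} 1 & \omega \in A \\ 0 & \omega \notin A \end{cases}$$

Our result asserts that for all $\omega \in \Omega$, 

$$1_{\{\bigcup_{i=1}^{M} A_i\}} \ge \frac{2}{r+1}\,\mathcal{S}_1 - \frac{2}{r(r+1)}\,\mathcal{S}_2 \tag{3}$$

where 

$$\mathcal{S}_1 = \sum_{i \in I} 1_{\{A_i\}}, \qquad \mathcal{S}_2 = \sum_{\substack{i,j \in I \\ i>j}} 1_{\{A_i\}}\, 1_{\{A_j\}}$$

By taking the expectation over both sides of (3), we get (2) as a special case. We prove (3) in Appendix I. 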
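
The pointwise inequality (3) can be checked exhaustively for a small number of events. The sketch below is an illustration only: it enumerates every membership pattern of a point $\omega$ in $M$ events and every $r \in \{1,\ldots,M-1\}$. The helper names `indicator_bound` and `check_inequality` are hypothetical.

```python
from itertools import combinations

def indicator_bound(memberships, r):
    """RHS of (3) at a point omega; memberships[i] is True iff omega lies in A_i."""
    s1 = sum(memberships)                       # number of events containing omega
    s2 = sum(1 for a, b in combinations(memberships, 2) if a and b)
    return 2 * s1 / (r + 1) - 2 * s2 / (r * (r + 1))

def check_inequality(M):
    """Check (3) for every membership pattern of M events and every r in 1..M-1."""
    for pattern in range(2 ** M):
        memberships = [bool((pattern >> i) & 1) for i in range(M)]
        lhs = 1 if any(memberships) else 0      # indicator of the union
        for r in range(1, M):
            if lhs < indicator_bound(memberships, r) - 1e-12:
                return False
    return True

if __name__ == "__main__":
    print(check_inequality(6))
```

If $\omega$ lies in exactly $s$ events, the right-hand side equals $s(2r+1-s)/(r(r+1))$, which reaches 1 when $s = r$ or $s = r+1$; this is why choosing $r$ close to the typical number of simultaneously occurring events makes the bound tight.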

C. LDPC Code Ensembles 

We consider the standard bipartite graph-based $(c,d)$-regular LDPC code ensemble with block length 
$N$ and design rate $R$. In this ensemble a randomly chosen permutation is used to match the $cN$ left 
sockets to the $d(1-R)N$ right sockets. The actual rate of the code is at least $R = 1 - c/d$. 
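
The socket-matching construction can be sketched as follows. This is a minimal illustration that assumes a uniformly random permutation of the variable sockets; the function name `sample_regular_ldpc` and the list-of-lists graph representation are choices made here, not taken from the paper. Parallel edges between a variable node and a check node may occur, as in the ensemble itself.

```python
import random

def sample_regular_ldpc(N, c, d, seed=None):
    """Return, for each check node, the variable nodes attached to its d sockets."""
    if (N * c) % d != 0:
        raise ValueError("N*c must be divisible by d")
    rng = random.Random(seed)
    # Variable socket s belongs to variable node s // c; list each node c times.
    variable_sockets = [v for v in range(N) for _ in range(c)]
    rng.shuffle(variable_sockets)        # random matching of the cN sockets
    num_checks = N * c // d              # equals N(1 - R) with R = 1 - c/d
    return [variable_sockets[j * d:(j + 1) * d] for j in range(num_checks)]

if __name__ == "__main__":
    checks = sample_regular_ldpc(12, 3, 6, seed=0)
    print(len(checks), checks[0])
```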



III. Upper Bound on Error Exponent for the BEC 



Recall that a stopping set S of a bipartite graph representation of an LDPC code is a set of variable 
nodes, such that each check node neighbor of S is connected to S by at least two edges. As explained in 
[13], iterative decoding of LDPC codes succeeds if and only if the set of variable nodes which correspond 
to erasures does not contain a subset which is a stopping set. 
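
The stopping set definition and its link to iterative decoding can be illustrated with a short sketch. The graph is given as a list of check nodes, each a list of neighboring variable nodes (one entry per edge); this representation and the names `is_stopping_set` and `peel` are assumptions made for the example.

```python
def is_stopping_set(checks, S):
    """True iff no check node is connected to S by exactly one edge."""
    S = set(S)
    for neighbors in checks:
        if sum(1 for v in neighbors if v in S) == 1:
            return False
    return True

def peel(checks, erased):
    """Iterative BEC decoding: resolve any check with exactly one erased edge.
    Returns the set of variable nodes still erased when no check can help."""
    erased = set(erased)
    progress = True
    while progress:
        progress = False
        for neighbors in checks:
            hits = [v for v in neighbors if v in erased]
            if len(hits) == 1:
                erased.discard(hits[0])
                progress = True
    return erased

if __name__ == "__main__":
    checks = [[0, 1, 3], [1, 2, 4], [0, 2, 5]]
    print(is_stopping_set(checks, {0, 1, 2}))   # every check sees two nodes of the set
    print(peel(checks, {0, 1, 4}))              # contains no non-empty stopping set
    print(peel(checks, {0, 1, 2, 4}))           # decoder stalls on a stopping set
```

The set returned by `peel` is itself a stopping set (possibly empty), so decoding fails exactly when the erased positions contain a non-empty stopping set.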

The expurgated $(c,d)$-regular LDPC ensemble $\mathcal{C}^{\gamma}$ is derived from the $(c,d)$-regular ensemble $\mathcal{C}^{0}$ by 
removing all the codes containing stopping sets of size $\gamma N$ or less. It was shown in [7] that for ensembles 
with $c > 2$, if $\gamma$ is selected below a certain threshold $\alpha_0$, then almost all codes in $\mathcal{C}^{0}$ belong to $\mathcal{C}^{\gamma}$. In 
other words, if $C$ is drawn at random from $\mathcal{C}^{0}$, 

$$\Pr\left(C \in \mathcal{C}^{\gamma}\right) = 1 - o(1) \quad \forall\, \gamma < \alpha_0 \tag{4}$$

The number $\alpha_0 N$ may therefore be considered to be the typical minimum stopping set size of $\mathcal{C}^{0}$. 
Since the behavior of $\mathcal{C}^{0}$ is dominated by a small fraction of "bad" codes, we will be interested in the 
performance of codes drawn at random from $\mathcal{C}^{\gamma}$. Let $C$ be such a code. 
Consider a BEC with erasure probability $\delta$; the probability of unsuccessful decoding of any codeword 
from $C$, $P_e^{C}$, is given by 

$$P_e^{C} = \sum_{l=\gamma N}^{N} \delta^{l}(1-\delta)^{N-l} \sum_{m} 1_{\{\bigcup_{i=1}^{2^l-1} A_i^m\}} \tag{5}$$

where the index $m$ runs over all sets of variable nodes containing exactly $l$ nodes; for a particular set $S_m$ 
of $l$ variable nodes, $A_i^m$ is the event that the $i$'th (non-empty) subset of $S_m$ (where $i = 1,\ldots,2^l - 1$) 
is a stopping set. Note that every set of $N(1-R)+1$ variable nodes contains the support of a nonzero 
codeword¹. Hence (since the support of every codeword is a stopping set), every set of $N(1-R)+1$ variable nodes 
contains a stopping set. Therefore, the indicator appearing in the RHS of (5) may be replaced by 1 for 
$l > N(1-R)$, which yields 

$$P_e^{C} = \sum_{l=\gamma N}^{N(1-R)} \delta^{l}(1-\delta)^{N-l} \sum_{m} 1_{\{\bigcup_{i=1}^{2^l-1} A_i^m\}} + \sum_{l=N(1-R)+1}^{N} \binom{N}{l} \delta^{l}(1-\delta)^{N-l} \tag{6}$$

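Next, we use (3) to lower-bound the indicator function in (6), giving 

$$1_{\{\bigcup_{i=1}^{2^l-1} A_i^m\}} \ge \frac{2}{r_l+1}\,\mathcal{S}_1 - \frac{2}{r_l(r_l+1)}\,\mathcal{S}_2 \tag{7}$$

where $r_l$ is allowed to depend on the size $l$ of the set, and 

$$\mathcal{S}_1 = \sum_{i=1}^{2^l-1} 1_{\{A_i^m\}}, \qquad \mathcal{S}_2 = \sum_{i=1}^{2^l-1} \sum_{k=1}^{i-1} 1_{\{A_i^m\}}\, 1_{\{A_k^m\}} \tag{8}$$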

Consider a stopping set $S$ containing $k$ variable nodes, where $k \le l$. The number of sets of variable 
nodes of size $l$ containing $S$ as a subset is $\binom{N-k}{l-k}$. Hence, again letting $m$ run over all subsets of size $l$, 
we have 

$$\sum_{m} \sum_{j=1}^{2^l-1} 1_{\{A_j^m\}} = \sum_{k=1}^{l} \binom{N-k}{l-k} S_k^{C} = \sum_{k=\gamma N}^{l} \binom{N-k}{l-k} S_k^{C} \tag{9}$$

where $S_k^{C}$ is the number of stopping sets with $k$ variable nodes in $C$; note that since $C$ belongs to the 
expurgated ensemble, we have $S_k^{C} = 0$ for $k < \gamma N$. 

In a similar fashion we obtain 

$$\sum_{m} \sum_{i=1}^{2^l-1} \sum_{j=1}^{i-1} 1_{\{A_i^m\}}\, 1_{\{A_j^m\}} = \sum_{\substack{\gamma N \le j \le i \le l \\ 0 \le k \le j+\min(i-j-1,\,0)}} \binom{N-(i+j-k)}{l-(i+j-k)} S_{i,j,k}^{C} \tag{10}$$

where $S_{i,j,k}^{C}$ is the number of pairs of stopping sets $(S_1, S_2)$ satisfying $|S_1| = i$, $|S_2| = j$, and $|S_1 \cap S_2| = k$. 
Recalling that both $S_1$ and $S_2$ must be subsets of a particular set of size $l$, their union must also be a 
subset, and therefore $|S_1 \cup S_2| = i + j - k \le l$. Furthermore, the application of (3) requires summing over 
pairs of distinct events. Consequently, we cannot have $S_1 = S_2$, i.e., when $i = j$, we must have $k < j$; 
this requirement is subsumed by imposing $0 \le k \le j + \min(i-j-1, 0)$ in (10). 

¹ This is tantamount to saying that $N(1-R)+1$ columns in the parity check matrix, regardless of how they are chosen, are 
linearly dependent; this follows since the matrix has $N(1-R)$ rows. 

Plugging (7)-(10) into (6), we get 



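$$
\begin{aligned}
P_e^{C} \ge{}& \sum_{l=\gamma N}^{N(1-R)} \delta^{l}(1-\delta)^{N-l} \Biggl[ \frac{2}{r_l+1} \sum_{k=\gamma N}^{l} \binom{N-k}{l-k} S_k^{C} - \frac{2}{r_l(r_l+1)} \sum_{\substack{\gamma N \le j \le i \le l \\ 0 \le k \le j+\min(i-j-1,\,0) \\ i+j-k \le l}} \binom{N-(i+j-k)}{l-(i+j-k)} S_{i,j,k}^{C} \Biggr] \\
&+ \sum_{l=N(1-R)+1}^{N} \binom{N}{l} \delta^{l}(1-\delta)^{N-l} \\
\ge{}& \sum_{l=\gamma N}^{N(1-R)} \delta^{N\epsilon}(1-\delta)^{N(1-\epsilon)} \Biggl[ \frac{2}{r_l+1} \max_{\gamma \le \eta \le \epsilon} \binom{N(1-\eta)}{N(\epsilon-\eta)} S_{\eta N}^{C} - \frac{2\,(\epsilon N)^3}{r_l(r_l+1)} \max_{\substack{\gamma \le \eta_2 \le \eta_1 \le \epsilon \\ 0 \le \beta \le \eta_2 \\ \eta_1+\eta_2-\beta \le \epsilon}} \binom{N(1-(\eta_1+\eta_2-\beta))}{N(\epsilon-(\eta_1+\eta_2-\beta))} S_{\eta_1 N,\eta_2 N,\beta N}^{C} \Biggr] \\
&+ \max_{1-R < \epsilon \le 1} \binom{N}{N\epsilon} \delta^{N\epsilon}(1-\delta)^{N(1-\epsilon)} \\
\overset{(a)}{\ge}{}& \max_{\gamma \le \epsilon \le 1-R} \Bigl\{ \delta^{N\epsilon}(1-\delta)^{N(1-\epsilon)}\, P_e^{C}(\epsilon,N) \Bigr\} + \max_{1-R < \epsilon \le 1} \Bigl\{ \binom{N}{N\epsilon} \delta^{N\epsilon}(1-\delta)^{N(1-\epsilon)} \Bigr\}
\end{aligned}
$$

where 

$$P_e^{C}(\epsilon,N) = \frac{2}{r_{\epsilon N}+1} \max_{\gamma \le \eta \le \epsilon} \binom{N(1-\eta)}{N(\epsilon-\eta)} S_{\eta N}^{C} - \frac{2\,(\epsilon N)^3}{r_{\epsilon N}(r_{\epsilon N}+1)} \max_{\substack{\gamma \le \eta_2 \le \eta_1 \le \epsilon \\ 0 \le \beta \le \eta_2 \\ \eta_1+\eta_2-\beta \le \epsilon}} \binom{N(1-(\eta_1+\eta_2-\beta))}{N(\epsilon-(\eta_1+\eta_2-\beta))} S_{\eta_1 N,\eta_2 N,\beta N}^{C} \tag{11}$$

and $\epsilon = l/N$, $\eta = k/N$, $\eta_1 = i/N$, $\eta_2 = j/N$, and $\beta = k/N$. A sufficient condition in order for (a) to hold is 
that $P_e^{C}(\epsilon, N)$ be non-negative for $\gamma \le \epsilon \le 1-R$. Later we will choose the value of $r_{\epsilon N}$ so that this 
condition is fulfilled. 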



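By expressing the bound in exponential form, we get the following upper bound on the error exponent 

$$-\frac{1}{N}\log P_e^{C} \le -\max_{\gamma \le \epsilon \le 1} \left\{ \epsilon\log\delta + (1-\epsilon)\log(1-\delta) + \begin{cases} \frac{1}{N}\log P_e^{C}(\epsilon,N) & \gamma \le \epsilon \le 1-R \\ h(\epsilon) & 1-R < \epsilon \le 1 \end{cases} \right\} + o(1)$$

where we rely upon (1), and 

$$P_e^{C}(\epsilon,N) = \frac{2}{r_{\epsilon N}+1}\, 2^{-N E_1'} - \frac{2}{r_{\epsilon N}(r_{\epsilon N}+1)}\, 2^{-N E_2'} \tag{12}$$

$$E_1' = -\max_{\gamma \le \eta \le \epsilon} \left\{ (1-\eta)\, h\!\left(\frac{\epsilon-\eta}{1-\eta}\right) + \frac{1}{N}\log S_{\eta N}^{C} \right\} + o(1) \tag{13}$$

$$E_2' = -\max_{\substack{\gamma \le \eta_2 \le \eta_1 \le \epsilon \\ 0 \le \beta \le \eta_2 \\ \eta_1+\eta_2-\beta \le \epsilon}} \left\{ (1-(\eta_1+\eta_2-\beta))\, h\!\left(\frac{\epsilon-(\eta_1+\eta_2-\beta)}{1-(\eta_1+\eta_2-\beta)}\right) + \frac{1}{N}\log S_{\eta_1 N,\eta_2 N,\beta N}^{C} \right\} + o(1) \tag{14}$$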



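Let $C$ be a randomly selected code from $\mathcal{C}^{0}$, and let $\bar{S}_i$ and $\bar{S}_{i,j,k}$ be the averages, over $\mathcal{C}^{0}$, of $S_i^{C}$ and 
$S_{i,j,k}^{C}$, respectively. We evaluate these average quantities and then relate them to $S_i^{C}$ and $S_{i,j,k}^{C}$.² In order 
to evaluate these quantities, we introduce the following notation. 

$$\psi_l(x;d) \triangleq \sum_{i=l}^{d} \binom{d}{i} x^i = (1+x)^d - \sum_{i=0}^{l-1} \binom{d}{i} x^i \tag{15}$$

$$\Psi_{i_-,j_-,k_-}^{i_+,j_+,k_+}(x,y,z,d) \triangleq \sum_{\substack{i_- \le i \le i_+ \\ j_- \le j \le j_+ \\ k_- \le k \le k_+ \\ i+j+k \le d}} \binom{d}{i,j,k}\, x^i y^j z^k \tag{16}$$

The average quantities satisfy 

$$\bar{S}_i = \binom{N}{i} P_{s,1}(i), \qquad \bar{S}_{i,j,k} = \binom{N}{i-k,\;k,\;j-k} P_{s,2}(i,j,k) \tag{17}$$

where $P_{s,1}(i)$ is the probability that a specific set of $i$ variable nodes, $S$, is a stopping set, and $P_{s,2}(i,j,k)$ 
is the probability that a specific pair of sets, $S_1$ containing $i$ variable nodes and $S_2$ containing $j$ variable 
nodes, with $|S_1 \cap S_2| = k$, are both stopping sets. 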

To evaluate $P_{s,1}(i)$, we need to fix a set $S$ of $i$ variable nodes and count the number of possibilities 
of connecting their $ic$ variable sockets to $ic$ check sockets such that each of the $L$ check nodes is either 
(a) not connected to any of the $ic$ variable sockets, or (b) connected by at least two check sockets. This 
combinatorial problem can be solved by means of the enumeration function in (15). The total number 
of ways to connect $ic$ variable sockets to $Nc$ check sockets is $\binom{Nc}{ic}$, therefore 

$$P_{s,1}(i) = \frac{[x^{ic}]\,\bigl(1+\psi_2(x;d)\bigr)^{L}}{\binom{Nc}{ic}} \tag{18}$$
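
Expression (18) can be evaluated exactly for small parameters. The sketch below is illustrative only; it assumes $L = Nc/d$ check nodes (the number implied by the socket counts of the ensemble) and uses hypothetical helper names `poly_pow_coeff` and `p_stopping`.

```python
from fractions import Fraction
from math import comb

def poly_pow_coeff(base, power, target):
    """Coefficient of x^target in base(x)**power; base is a list of coefficients."""
    result = [1] + [0] * target
    for _ in range(power):
        nxt = [0] * (target + 1)
        for a, ca in enumerate(result):
            if ca == 0:
                continue
            for b, cb in enumerate(base):
                if a + b > target:
                    break
                nxt[a + b] += ca * cb
        result = nxt
    return result[target]

def p_stopping(N, c, d, i):
    """P_{s,1}(i) from (18), assuming L = N*c/d check nodes."""
    L = N * c // d
    beta = [comb(d, t) for t in range(d + 1)]
    beta[1] = 0                      # 1 + psi_2(x; d) = (1 + x)^d - d*x
    return Fraction(poly_pow_coeff(beta, L, i * c), comb(N * c, i * c))

if __name__ == "__main__":
    N, c, d = 4, 2, 4
    for i in range(N + 1):
        print(i, p_stopping(N, c, d, i), comb(N, i) * p_stopping(N, c, d, i))
```

The second printed column is the average number of stopping sets of size $i$, i.e., $\binom{N}{i} P_{s,1}(i)$. For $i = N$ the probability is 1, since every check node is then connected to the set by all $d \ge 2$ of its edges.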



We proceed with the evaluation of $P_{s,2}(i,j,k)$. Given two sets $S_1$ and $S_2$ of variable nodes with 
$|S_1| = i$, $|S_2| = j$, $|S_1 \cap S_2| = k$, we need to count the number of possibilities of connecting $(i-k)c$ 
sockets from $S_1 \setminus S_2$, $kc$ sockets from $S_1 \cap S_2$ and $(j-k)c$ sockets from $S_2 \setminus S_1$ to $(i+j-k)c$ check 
sockets, such that both $S_1$ and $S_2$ are stopping sets. This situation is depicted in Figure 1. Consider a 
check node $a$ in the graph. From the definition of a stopping set, it can be seen that in order to have 
both $S_1$ and $S_2$ as stopping sets, $a$ has to fall into one of the following disjoint categories: 

• $a$ is not connected at all to nodes in $S_1 \cup S_2$. 

• $a$ is connected by at least two edges to nodes in $S_1 \setminus S_2$ and is not connected to nodes in $S_2$. 

• $a$ is connected by at least two edges to nodes in $S_2 \setminus S_1$ and is not connected to nodes in $S_1$. 

• $a$ is connected by at least two edges to nodes in $S_1 \setminus S_2$ and by at least two edges to nodes in $S_2 \setminus S_1$, 
but is not connected to any node in $S_1 \cap S_2$. 

• $a$ is connected by exactly one edge to a node in $S_1 \cap S_2$, and by at least one edge to nodes in $S_1 \setminus S_2$ 
and in $S_2 \setminus S_1$. 

• $a$ is connected by at least two edges to nodes in $S_1 \cap S_2$. 

Fig. 1. Two intersecting stopping sets and a check node $a$. 

² Recall that in our context $C$ is selected uniformly from $\mathcal{C}^{\gamma}$. 

This combinatorial problem can be solved using the enumeration function given in (16). The total number 
of possibilities of connecting $(i-k)c$ sockets from $S_1 \setminus S_2$, $kc$ sockets from $S_1 \cap S_2$ and $(j-k)c$ sockets 
from $S_2 \setminus S_1$ to $Nc$ check sockets is $\binom{Nc}{(i-k)c,\;kc,\;(j-k)c}$. Therefore, 

$$P_{s,2}(i,j,k) = \frac{[x^{(i-k)c}\, y^{kc}\, z^{(j-k)c}]\; B(x,y,z,d)^{L}}{\binom{Nc}{(i-k)c,\;kc,\;(j-k)c}}$$

$$B(x,y,z,d) = 1 + \Psi_{2,0,0}^{d,0,0}(x,y,z,d) + \Psi_{0,0,2}^{0,0,d}(x,y,z,d) + \Psi_{2,0,2}^{d-2,0,d-2}(x,y,z,d) + \Psi_{1,1,1}^{d-2,1,d-2}(x,y,z,d) + \Psi_{0,2,0}^{d-2,d,d-2}(x,y,z,d) \tag{19}$$

where the six terms of $B$ correspond, in order, to the six categories listed above. 



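We turn our attention back to the relation between the average quantities $\bar{S}_i$ and $\bar{S}_{i,j,k}$ and those of the 
randomly selected code, $S_i^{C}$ and $S_{i,j,k}^{C}$. By assuming that $C$ is selected at random with uniform probability 
from $\mathcal{C}^{0}$ and using conditioning, we have 

$$
\begin{aligned}
\Pr\left(S_{i,j,k}^{C} > N\bar{S}_{i,j,k} \,\middle|\, C \in \mathcal{C}^{\gamma}\right) &= \frac{\Pr\left(S_{i,j,k}^{C} > N\bar{S}_{i,j,k}\right) - \Pr\left(C \notin \mathcal{C}^{\gamma},\; S_{i,j,k}^{C} > N\bar{S}_{i,j,k}\right)}{\Pr\left(C \in \mathcal{C}^{\gamma}\right)} \\
&\overset{(a)}{\le} \frac{\Pr\left(S_{i,j,k}^{C} > N\bar{S}_{i,j,k}\right)}{1-o(1)} \overset{(b)}{\le} \frac{1}{N\,(1-o(1))}
\end{aligned}
\tag{20}
$$

where (a) is obtained using (4) and by omitting the negative term, and (b) is due to Markov's inequality. 
We conclude from (20) that w.p. (with probability) $1 - o(1)$, for $C$ chosen randomly with uniform 
probability from $\mathcal{C}^{\gamma}$, 

$$\frac{1}{N}\log S_{i,j,k}^{C} \le \frac{1}{N}\log \bar{S}_{i,j,k} + o(1) \tag{21}$$

By using conditioning once more we obtain 

$$
\begin{aligned}
\Pr\left(1-\epsilon < \frac{S_k^{C}}{\bar{S}_k} < 1+\epsilon \,\middle|\, C \in \mathcal{C}^{\gamma}\right) &\ge \frac{\Pr\left(1-\epsilon < \frac{S_k^{C}}{\bar{S}_k} < 1+\epsilon\right) - \Pr\left(C \notin \mathcal{C}^{\gamma}\right)}{\Pr\left(C \in \mathcal{C}^{\gamma}\right)} \\
&\overset{(a)}{\ge} \Pr\left(1-\epsilon < \frac{S_k^{C}}{\bar{S}_k} < 1+\epsilon\right) + o(1)
\end{aligned}
\tag{22}
$$

where (a) is obtained by using (4) and replacing the denominator by 1. 

Rathi [8] has obtained a concentration result on the stopping set distribution. His result implies the 
following. For any $\epsilon > 0$, 

$$\Pr\left(1-\epsilon < \frac{S_{\eta N}^{C}}{\bar{S}_{\eta N}} < 1+\epsilon\right) \ge 1 - \frac{\beta_{\eta,d,c}}{\epsilon^2} + o(1) \tag{23}$$

where $\beta_{\eta,d,c}$ is a constant given in Eq. (37) in Appendix II, independent of $N$, which satisfies $\beta_{\eta,d,c} \to 0$ 
when $d \to \infty$ and $c/d$ is kept constant. By setting $\epsilon \to 1$ in (23) and using (22), we conclude that w.p. at 
least $1 - \beta_{\eta,d,c} + o(1)$, for $C$ chosen randomly with uniform probability from $\mathcal{C}^{\gamma}$, 

$$\frac{1}{N}\log S_{\eta N}^{C} \ge \frac{1}{N}\log \bar{S}_{\eta N} + o(1) \tag{24}$$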

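Define 

$$E_1 \triangleq -\max_{\gamma \le \eta \le \epsilon} \left\{ (1-\eta)\, h\!\left(\frac{\epsilon-\eta}{1-\eta}\right) + \frac{1}{N}\log \bar{S}_{\eta N} \right\} \tag{25}$$

$$E_2 \triangleq -\max_{\substack{\gamma \le \eta_2 \le \eta_1 \le \epsilon \\ 0 \le \beta \le \eta_2 \\ \eta_1+\eta_2-\beta \le \epsilon}} \left\{ (1-(\eta_1+\eta_2-\beta))\, h\!\left(\frac{\epsilon-(\eta_1+\eta_2-\beta)}{1-(\eta_1+\eta_2-\beta)}\right) + \frac{1}{N}\log \bar{S}_{\eta_1 N,\eta_2 N,\beta N} \right\} \tag{26}$$

then by combining (12), (13), (14), (21) and (24), we obtain that, w.p. at least $1 - \beta_{\eta,d,c} + o(1)$, 

$$P_e^{C}(\epsilon,N) \ge \frac{2}{r_{\epsilon N}+1}\, 2^{-N(E_1+o(1))} - \frac{2}{r_{\epsilon N}(r_{\epsilon N}+1)}\, 2^{-N(E_2+o(1))} \tag{27}$$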

As we are interested in the asymptotic behavior of $E_1$ and $E_2$ (and thus the exponential growth rate 
of the stopping set distributions), we use [7, Theorem 2], which asserts the following³: 

³ Here we give the multivariate version of the theorem with 3 variables; the theorem generalizes to any number of variables. 

Let $p(x, y, z)$ be a trivariate polynomial with non-negative coefficients. Let $\alpha_1 > 0$, $\alpha_2 > 0$ and $\alpha_3 > 0$ 
be some rational numbers and let $n_i$ be the series of all indices such that 

$$[x^{\alpha_1 n_i}\, y^{\alpha_2 n_i}\, z^{\alpha_3 n_i}]\; p(x,y,z)^{n_i} \ne 0$$

Then 

$$\lim_{i \to \infty} \frac{1}{n_i}\log\, [x^{\alpha_1 n_i}\, y^{\alpha_2 n_i}\, z^{\alpha_3 n_i}]\; p(x,y,z)^{n_i} = \inf_{x,y,z>0} \log\left(\frac{p(x,y,z)}{x^{\alpha_1} y^{\alpha_2} z^{\alpha_3}}\right) \tag{28}$$
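
A univariate instance of (28) can be checked numerically. The sketch below takes $p(x) = 1 + \psi_2(x;d) = (1+x)^d - dx$ and compares the exact normalized log-coefficient for a finite $n$ with the infimum on the right-hand side. The helper names, the ternary search over $t = \ln x$ (valid because $\log p(e^t)$ is convex for polynomials with non-negative coefficients), and the parameter values are all choices made here for illustration.

```python
import math
from math import comb

def coeff_of_power(base, power, target):
    """Coefficient of x^target in base(x)**power (exact integer arithmetic)."""
    result = [1] + [0] * target
    for _ in range(power):
        nxt = [0] * (target + 1)
        for a, ca in enumerate(result):
            if ca == 0:
                continue
            for b, cb in enumerate(base):
                if a + b > target:
                    break
                nxt[a + b] += ca * cb
        result = nxt
    return result[target]

def inf_exponent(alpha, d, lo=-30.0, hi=30.0, iters=200):
    """inf over x > 0 of log2(((1+x)^d - d x) / x^alpha), by ternary search on ln x."""
    def f(t):
        x = math.exp(t)
        return math.log2((1 + x) ** d - d * x) - alpha * t / math.log(2)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return f(0.5 * (lo + hi))

if __name__ == "__main__":
    d, alpha, n = 8, 2, 200
    base = [comb(d, t) for t in range(d + 1)]
    base[1] = 0
    exact = math.log2(coeff_of_power(base, n, alpha * n)) / n
    print(exact, inf_exponent(alpha, d))
```

For any finite $n$ the exact value cannot exceed the infimum, because a single coefficient of a polynomial with non-negative coefficients is at most $p(x)^n / x^{\alpha n}$ for every $x > 0$; the two values approach each other as $n$ grows.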
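Using (28) together with the expressions for the average quantities derived above, we obtain 

$$E_1 = -h(\epsilon) - \max_{\gamma \le \eta \le \epsilon} \left\{ \epsilon\, h\!\left(\frac{\eta}{\epsilon}\right) - c\,h(\eta) + \frac{c}{d} \inf_{x>0} \log\left(\frac{1+\psi_2(x;d)}{x^{\eta d}}\right) \right\}$$

$$E_2 = -h(\epsilon) - \max_{\substack{\gamma \le \eta_2 \le \eta_1 \le \epsilon \\ 0 \le \beta \le \eta_2 \\ \eta_1+\eta_2-\beta \le \epsilon}} \left\{ \epsilon\, h\!\left(\frac{\eta_1-\beta}{\epsilon}, \frac{\eta_2-\beta}{\epsilon}, \frac{\beta}{\epsilon}\right) - c\,h(\eta_1-\beta,\, \eta_2-\beta,\, \beta) + \frac{c}{d} \inf_{x,y,z>0} \log\left(\frac{B(x,y,z,d)}{x^{(\eta_1-\beta)d}\, y^{\beta d}\, z^{(\eta_2-\beta)d}}\right) \right\}$$

If $E_2 \ge E_1$, we choose $r_{\epsilon N} = 1$ in (27). In this case, taking the union bound over all possible 
stopping sets yields an exponentially tight bound. In the case that $E_2 < E_1$, we use (27) with $r_{\epsilon N} = 
2^{N(E_1 - E_2 + \alpha)}$, where $\alpha > 0$ can be made arbitrarily small (hence, the non-negativity of $P_e^{C}(\epsilon,N)$ in 
(11) is established). Thus, we obtain the following upper bound on the error exponent 

$$-\frac{1}{N}\log P_e^{C} \le -\max_{\gamma \le \epsilon \le 1} \left\{ \epsilon\log\delta + (1-\epsilon)\log(1-\delta) - \begin{cases} E & \gamma \le \epsilon \le 1-R \\ -h(\epsilon) & 1-R < \epsilon \le 1 \end{cases} \right\} + o(1)$$

$$E = \begin{cases} E_1 & E_2 \ge E_1 \\ 2E_1 - E_2 & E_2 < E_1 \end{cases}$$

This bound holds w.p. at least $1 - \beta_{\eta_0,d,c} + o(1)$, where $\eta_0$ is the maximizing value of $\eta$ in 
the expression for $E_1$. 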
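
The effect of the choice of $r_{\epsilon N}$ can be illustrated numerically. The sketch below evaluates the right-hand side of (27) with the $o(1)$ terms dropped, for hypothetical values of $N$, $E_1$, $E_2$ and $\alpha$ chosen only for the example, and compares its decay rate with the exponent $E$ defined above. Note that $r$ is treated as a real number here rather than rounded to an integer.

```python
import math

def combined_exponent(E1, E2):
    """The exponent E: E1 if E2 >= E1, and 2*E1 - E2 otherwise."""
    return E1 if E2 >= E1 else 2 * E1 - E2

def bound_27(N, E1, E2, alpha):
    """RHS of (27) without the o(1) terms, using the choice of r from the text."""
    r = 1.0 if E2 >= E1 else 2.0 ** (N * (E1 - E2 + alpha))
    return 2 / (r + 1) * 2.0 ** (-N * E1) - 2 / (r * (r + 1)) * 2.0 ** (-N * E2)

if __name__ == "__main__":
    N, E1, E2, alpha = 200, 0.3, 0.1, 0.01    # illustrative values only
    value = bound_27(N, E1, E2, alpha)
    print(value > 0, -math.log2(value) / N, combined_exponent(E1, E2) + alpha)
```

With $E_2 < E_1$, the two terms of (27) are of order $2^{-N(2E_1 - E_2 + \alpha)}$ and $2^{-N(2E_1 - E_2 + 2\alpha)}$ respectively, so the difference stays positive and decays at rate $2E_1 - E_2 + \alpha$.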



IV. Numerical Results 

In this section, we compare our upper bound on the error exponent of the BEC with previously-known 
lower bounds. These bounds were derived in [7, Theorems 8,12]; one of these bounds applies for iterative 
decoding, while the other applies for ML decoding. 

In Figure 2 we exemplify our bound for the regular (4, 8) LDPC ensemble. Recalling that the bound 
applies with a certain probability, we have marked the plot where the bound has a confidence level above 
99%. We note that the entire plot of the upper bound is true w.p. at least 70%. 

Figure 3 shows the confidence level bound from (23) which corresponds to the upper bound plot in 
Figure 2. Looking back at Figure 2, for low values of $\delta$, the upper bound on the exponent coincides with 
the two lower bounds from [7, Theorems 6,8]. That is, our results indicate that in the region $\delta \in [0, 0.17]$, 
the bound on the error exponent of the expurgated ensemble in [7, Theorem 6], which coincides with 
the bound in [7, Theorem 8] in this region, is tight. Similarly, for the (3, 6) ensemble and $\delta \in [0, 0.26]$, 
the lower bound on the error exponent of the expurgated ensemble in [7, Theorem 6] (which coincides 
with the lower bound in [7, Theorem 8] in this region) is tight⁴. 

Fig. 2. Error exponents for the regular (4,8) LDPC ensemble. 

Fig. 3. Confidence level bound for the regular (4,8) LDPC ensemble. 

Focusing on higher values of $\delta$, where the confidence level is higher, comparison of our upper bound 
with the lower bound on the ML decoding exponent reveals that there is a gap in performance between 
iterative and ML decoders, at least for most codes in the ensemble. 



V. Conclusion and Further Research 



We have derived an upper bound on the error exponent of LDPC codes transmitted over the BEC. The 
upper bound relies on the Dawson-Sankoff inequality [14] and holds with a certain confidence level. It was demonstrated 
that for some values of the channel erasure probability there is a gap between our upper bound and some 
previously reported lower bounds. 

Continued research could focus on extending our results to irregular ensembles of LDPC codes. This 
requires extending the results of [8], regarding concentration of stopping sets, to irregular codes. Another 
possible avenue is to try to bridge the gap between the lower and upper bounds; with the asymptotic 
decoding threshold for the (4, 8) ensemble at about 0.38, there is still room for improvement in the 
bounds. 

⁴ We note that these lower bounds, as depicted in [7, Figure 3], do not coincide with each other in this $\delta$ region due to a 
numerical inaccuracy. 

Appendices 

Appendix I 
Proof of (3) 

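Given the events $A_1,\ldots,A_M$, define the set $B_s$, $s = 1,\ldots,M$, as the set of points in $\bigcup_{i=1}^{M} A_i$ contained 
in exactly $s$ sets. We thus have 

$$\mathcal{S}_1 = \sum_{k=1}^{M} 1_{\{A_k\}} = \sum_{k=1}^{M} k\, 1_{\{B_k\}} \tag{31}$$

$$\mathcal{S}_2 = \sum_{k=1}^{M} \sum_{i=1}^{k-1} 1_{\{A_k\}}\, 1_{\{A_i\}} = \sum_{k=2}^{M} \binom{k}{2} 1_{\{B_k\}} \tag{32}$$

We will find a lower bound for 

$$F = 1_{\{\bigcup_{i=1}^{M} A_i\}} = \sum_{k=1}^{M} 1_{\{B_k\}} \tag{33}$$

First, fix the value of $r$. Solving (31) and (32) to isolate $1_{\{B_r\}}$ and $1_{\{B_{r+1}\}}$ we get 

$$1_{\{B_r\}} = \mathcal{S}_1 - \frac{2\mathcal{S}_2}{r} - \sum_{\substack{k=1 \\ k \ne r}}^{M} \frac{k(r+1-k)}{r}\, 1_{\{B_k\}} \tag{34}$$

$$1_{\{B_{r+1}\}} = \frac{r-1}{r+1}\, 1_{\{B_1\}} + \frac{2\mathcal{S}_2}{r+1} - \frac{r-1}{r+1}\,\mathcal{S}_1 - \sum_{\substack{k=2 \\ k \ne r+1}}^{M} \frac{k(k-r)}{r+1}\, 1_{\{B_k\}} \tag{35}$$

Substituting (34) and (35) into (33) we get 

$$F - \frac{2\mathcal{S}_1}{r+1} + \frac{2\mathcal{S}_2}{r(r+1)} = \frac{r-1}{r+1}\, 1_{\{B_1\}} + \sum_{k=2}^{M} \frac{(r-k)(r-k+1)}{r(r+1)}\, 1_{\{B_k\}} \tag{36}$$

Note that the RHS of (36) contains only non-negative elements. Thus, if the RHS of (36) is replaced by 
zero, we obtain the inequality 

$$1_{\{\bigcup_{i=1}^{M} A_i\}} \ge \frac{2}{r+1}\,\mathcal{S}_1 - \frac{2}{r(r+1)}\,\mathcal{S}_2$$

which is the desired result. 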



Appendix II 

Confidence Interval of Stopping Set Distribution 

Rathi [8] has obtained a result asserting the concentration of the stopping set distribution. To state his 
result, we introduce some notation. 

• Denote $\beta(x) = 1 + \psi_2(x,d)$, where $\psi$ is defined in (15). 

• The equation 

$$\frac{x\left((1+x)^{d-1} - 1\right)}{(1+x)^{d} - dx} = \eta$$

has a single real positive solution; denote this solution by $x_\eta$. 

• Define ap(x) 4 ^« and b p (x) 4 ^ 

Let x = (xi, X2, X3). For a multivariate function /(x), denote a/(x) to be a 3-element vector whose 

9/ 



elements are df^ = (^ygj-J- Let C/(x) denote a 3 x 3 matrix whose elements are given by 

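The concentration result is as follows. The number of stopping sets $S_{\eta N}^{C}$ in a randomly selected code $C$ 
satisfies 

$$\Pr\left(1-\epsilon < \frac{S_{\eta N}^{C}}{\bar{S}_{\eta N}} < 1+\epsilon\right) \ge 1 - \frac{\beta_{\eta,d,c}}{\epsilon^2} + o(1) \tag{37}$$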



where 



= bp(xr,)Vdri(l - v)a c (r] 2 ) 1 

^\C^ Xjl ,xlx v )\(r,\l- V f - (c- l>2(r,2)) 

CTc2(??2) = cd|(-l, 1, -1) • ^(x^x^x,)"! • (-1, 1, 
f?(x) = S(xi,x 2 ,x 3j d) 

and $B(\cdot,\cdot,\cdot,d)$ is defined in (19). 
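
The solution $x_\eta$ introduced in this appendix has no closed form in general, but it can be found numerically. The sketch below assumes that the defining equation reads $x\bigl((1+x)^{d-1}-1\bigr) / \bigl((1+x)^d - dx\bigr) = \eta$, i.e., $x\beta'(x)/(d\,\beta(x)) = \eta$ with $\beta(x) = (1+x)^d - dx$; this form is inferred from the definition of $\beta(x)$ and should be checked against [8]. The function name `x_eta` and the bisection scheme are choices made here.

```python
def x_eta(eta, d, tol=1e-12):
    """Positive root of x((1+x)^(d-1) - 1) / ((1+x)^d - d x) = eta, for 0 < eta < 1."""
    def g(x):
        return x * ((1 + x) ** (d - 1) - 1) / ((1 + x) ** d - d * x)
    lo, hi = 0.0, 1.0
    while g(hi) < eta:               # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if g(mid) < eta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(x_eta(0.3, 8))
```

The left-hand side tends to 0 as $x \to 0$ and to 1 as $x \to \infty$, so a bracket always exists for $0 < \eta < 1$; uniqueness of the positive root is the property stated in the text.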